Sliding Windows with Limited Storage

Authors

  • Paul Beame
  • Raphaël Clifford
  • Widad Machmouchi
Abstract

We consider time-space tradeoffs for exactly computing frequency moments and order statistics over sliding windows [16]. Given an input of length 2n − 1, the task is to output the function of each window of length n, giving n outputs in total. Computations over sliding windows are related to direct sum problems except that inputs to instances almost completely overlap.

  • We show an average case and randomized time-space tradeoff lower bound of T · S ∈ Ω(n) for multi-way branching programs, and hence standard RAM and word-RAM models, to compute the number of distinct elements, F0, in sliding windows over alphabet [n]. The same lower bound holds for computing the low-order bit of F0 and computing any frequency moment Fk for k ≠ 1. We complement this lower bound with a T · S ∈ Õ(n) deterministic RAM algorithm for exactly computing Fk in sliding windows.
  • We show time-space separations between the complexity of sliding-window element distinctness and that of sliding-window F0 mod 2 computation. In particular for alphabet [n] there is a very simple errorless sliding-window algorithm for element distinctness that runs in O(n) time on average and uses O(log n) space.
  • We show that any algorithm for a single element distinctness instance can be extended to an algorithm for the sliding-window version of element distinctness with at most a polylogarithmic increase in the time-space product.
  • Finally, we show that the sliding-window computation of order statistics such as the maximum and minimum can be computed with only a logarithmic increase in time, but that a T · S ∈ Ω(n) lower bound holds for sliding-window computation of order statistics such as the median, a nearly linear increase in time when space is small.

∗This work was done while visiting The University of Washington, Seattle, WA.
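To make the problem statement concrete, the following is a minimal sketch of the sliding-window F0 task: given an input of length 2n − 1, it returns the number of distinct elements in each of the n windows of length n. It is a naive baseline that keeps a hash table of counts and therefore uses space linear in n; it is not the space-bounded algorithm from the paper. The names `sliding_window_f0` and `sliding_window_distinctness` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def sliding_window_f0(xs, n):
    """Return F0 (number of distinct elements) for every length-n window of xs.

    Naive baseline: stores one counter per distinct element in the window,
    so space is linear in n. Not the algorithm from the paper.
    """
    if n <= 0 or len(xs) < n:
        return []
    counts = Counter(xs[:n])          # multiplicities in the first window
    out = [len(counts)]
    for i in range(n, len(xs)):
        incoming, outgoing = xs[i], xs[i - n]
        counts[incoming] += 1         # element entering the window
        counts[outgoing] -= 1         # element leaving the window
        if counts[outgoing] == 0:
            del counts[outgoing]      # keep only elements still present
        out.append(len(counts))
    return out


def sliding_window_distinctness(xs, n):
    """A window has all-distinct elements exactly when its F0 equals n."""
    return [f0 == n for f0 in sliding_window_f0(xs, n)]
```

For example, with n = 3 and the length-5 input `[1, 2, 1, 3, 3]`, the three windows are `[1, 2, 1]`, `[2, 1, 3]` and `[1, 3, 3]`. Taking each F0 value mod 2 gives the low-order bit discussed in the abstract.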

Similar resources

Processing Sliding Window Multi-Joins in Continuous Queries over Data Streams

We study sliding window multi-join processing in continuous queries over data streams. Several algorithms are reported for performing continuous, incremental joins, under the assumption that all the sliding windows fit in main memory. The algorithms include multiway incremental nested loop joins (NLJs) and multi-way incremental hash joins. We also propose join ordering heuristics to minimize th...

On Indexing Sliding Windows over Online Data Streams

We consider indexing sliding windows in main memory over on-line data streams. Our proposed data structures and query semantics are based on a division of the sliding window into sub-windows. By classifying windowed operators according to their method of execution, we motivate the need for two types of windowed indices: those which provide a list of attribute values and their counts for answeri...

Efficient time-series subsequence matching using duality in constructing windows

In this paper, we propose a new subsequence matching method, Dual Match. Dual Match exploits duality in constructing windows and significantly improves performance. Dual Match divides data sequences into disjoint windows and the query sequence into sliding windows, and thus, is a dual approach of the one by Faloutsos et al. (Proceedings of the ACM SIGMOD International Conference on Management o...

Duality-Based Subsequence Matching in Time-Series Databases

In this paper, we propose a new subsequence matching method, DualMatch, which exploits duality in constructing windows and significantly improves performance. DualMatch divides data sequences into disjoint windows and the query sequence into sliding windows, and thus, is a dual approach of the one by Faloutsos et al. (FRM in short), which divides data sequences into sliding windows and the quer...

Inferring Fine-Grained Data Provenance in Stream Data Processing: Reduced Storage Cost, High Accuracy

Fine-grained data provenance ensures reproducibility of results in decision making, process control and e-science applications. However, maintaining this provenance is challenging in stream data processing because of its massive storage consumption, especially with large overlapping sliding windows. In this paper, we propose an approach to infer fine-grained data provenance by using a temporal ...

Journal:
  • Electronic Colloquium on Computational Complexity (ECCC)

Volume 19

Publication date: 2012